
    From phenotype to genotype: issues in navigating the available information resources

    Objectives: As part of an investigation of connecting health professionals and the lay public to both disease and genomic information, we assessed the availability and nature of the data from the Human Genome Project relating to human genetic diseases. Methods: We focused on a set of single-gene diseases selected from main topics in MEDLINEplus, the NLM's principal resource focused on consumers. We used publicly available websites to investigate specific questions about the genes and gene products associated with the diseases. We also investigated questions of knowledge and data representation for the information resources, as well as navigational issues. Results: Many online resources are available, but they are complex and technical. The major challenges encountered when navigating from phenotype to genotype were (1) the complexity of the data, (2) the dynamic nature of the data, (3) the diversity of foci and the number of information resources, and (4) the lack of standard data and knowledge representation methods. Conclusions: Three major informatics issues arise from the navigational challenges. First, the official gene names are insufficient for navigating these web resources. Second, navigational inconsistencies arise from difficulties in determining the number and function of alternate forms of a gene or gene product and in keeping current with this information. Third, synonymy and polysemy cause much confusion. These are severe obstacles to computational navigation from phenotype to genotype, especially for individuals who are novices in the underlying science. Tools and standards to facilitate this navigation are sorely needed.
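
    The synonymy obstacle described above lends itself to a small illustration. The sketch below is not a tool from the study; the gene symbols and alias sets are invented placeholders, showing only how a normalization step might map the aliases a reader meets on different sites back to one official symbol.

```python
# Minimal sketch of gene-symbol normalization across resources.
# Symbols and aliases below are illustrative placeholders, not data from the study.
SYNONYMS = {
    "CFTR": {"CFTR", "ABCC7", "MRP7"},       # hypothetical alias set
    "HBB":  {"HBB", "beta-globin"},          # hypothetical alias set
}

# Invert the table so any alias resolves to its official symbol.
ALIAS_TO_OFFICIAL = {
    alias.lower(): official
    for official, aliases in SYNONYMS.items()
    for alias in aliases
}

def normalize(symbol):
    """Return the official gene symbol for an alias, or None if unknown."""
    return ALIAS_TO_OFFICIAL.get(symbol.lower())

print(normalize("ABCC7"))   # -> CFTR
print(normalize("XYZ1"))    # -> None (unknown or polysemous names need curation)
```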

    ULearn: personalized medical learning on the web for patient empowerment

    Health literacy constitutes an important step towards patient empowerment, and the Web is presently the biggest repository of medical information and, thus, the biggest medical resource to be used in the learning process. However, at present, web medical information is mainly accessed through generic search engines that do not take into account the user's specific needs and starting knowledge, and so they are not able to support learning activities tailored to the specific user requirements. This work presents "ULearn", a meta engine that supports access, understanding and learning on the Web in the medical domain, based on specific user requirements and knowledge levels, towards what we call "balanced learning". Balanced learning allows users to perform learning activities based on specific user requirements (understanding, deepening, widening and exploring) towards their empowerment. We have designed and developed ULearn to suggest search keywords correlated to the different user requirements, and we have carried out preliminary experiments to evaluate the effectiveness of the information it provides.
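
    The requirement-driven keyword suggestion can be pictured with a much reduced stand-in. The expansion terms below are invented; ULearn derives its suggestions from its own analysis, which is not reproduced here.

```python
# Sketch of requirement-driven query expansion in the spirit of ULearn.
# Expansion terms are invented placeholders, not ULearn's actual keyword model.
EXPANSIONS = {
    "understanding": ["definition", "overview", "patient information"],
    "deepening":     ["pathophysiology", "mechanism", "clinical guidelines"],
    "widening":      ["related conditions", "comorbidities"],
    "exploring":     ["latest research", "clinical trials"],
}

def suggest_queries(seed_term, requirement):
    """Combine a seed medical term with requirement-specific keywords."""
    return [f"{seed_term} {keyword}" for keyword in EXPANSIONS.get(requirement, [])]

print(suggest_queries("hypertension", "deepening"))
```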

    Investigating Implicit Knowledge in Ontologies with Application to the Anatomical Domain

    This paper investigates implicit knowledge in two ontologies of anatomy: the Foundational Model of Anatomy and GALEN. The methods consist of extracting the explicitly represented knowledge, acquiring the implicit knowledge through augmentation and inference techniques, and identifying the origin of each semantic relation. The number of relations (12 million in FMA and 4.6 million in GALEN), broken down by source, is presented. Major findings include that each technique provides specific relations and that many relations can be generated by more than one technique. The application of these findings to ontology auditing, validation, and maintenance is discussed, as well as the application to ontology integration.
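
    One of the inference techniques alluded to here is propagating transitive relations such as part-of. The sketch below uses toy anatomy triples, not FMA or GALEN content, to show how implicit relations can be generated from explicit ones by transitive closure.

```python
# Toy illustration of inferring implicit relations by transitive closure.
# The anatomy triples are invented examples, not extracted from FMA or GALEN.
explicit = {
    ("left ventricle", "part_of", "heart"),
    ("heart", "part_of", "thorax"),
    ("mitral valve", "part_of", "left ventricle"),
}

def transitive_closure(triples, relation="part_of"):
    """Return explicit plus inferred triples for a transitive relation."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(closure):
            for (c, r2, d) in list(closure):
                if r1 == r2 == relation and b == c and (a, relation, d) not in closure:
                    closure.add((a, relation, d))
                    changed = True
    return closure

inferred = transitive_closure(explicit) - explicit
print(inferred)  # e.g. ('left ventricle', 'part_of', 'thorax'), ...
```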

    OWL-based acquisition and editing of computer-interpretable guidelines with the CompGuide editor

    Computer-Interpretable Guidelines (CIGs) are the dominant medium for the delivery of clinical decision support, given the evidence-based nature of their source material. Therefore, these machine-readable versions have the ability to improve practitioner performance and conformance to standards, with availability at the point and time of care. The formalisation of Clinical Practice Guideline knowledge in a machine-readable format is a crucial task to make it suitable for integration into Clinical Decision Support Systems. However, the current tools for this purpose reveal shortcomings with respect to their ease of use and the support offered during CIG acquisition and editing. In this work, we characterise the current landscape of CIG acquisition tools based on the properties of guideline visualisation, organisation, simplicity, automation, manipulation of knowledge elements, and guideline storage and dissemination. Additionally, we describe the CompGuide Editor, a tool for the acquisition of CIGs in the CompGuide model for Clinical Practice Guidelines that also allows the editing of previously encoded guidelines. The Editor guides users throughout the process of guideline encoding and does not require proficiency in any programming language. The features of the CIG encoding process are revealed through a comparison with already established tools for CIG acquisition. Funding: COMPETE, Grant/Award Number: POCI-01-0145-FEDER-007043; FCT - Fundacao para a Ciencia e Tecnologia, Grant/Award Number: UID/CEC/00319/201
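
    To make "machine-readable guideline knowledge" concrete, the sketch below encodes a tiny guideline fragment as tasks with conditions and an execution order. This is not CompGuide's actual OWL schema; the task names and the condition are invented for illustration only.

```python
# Illustrative (not CompGuide's actual schema) encoding of a guideline fragment
# as machine-readable tasks with trigger conditions and an execution order.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    condition: str = ""                      # free-text condition, invented example
    next_tasks: list = field(default_factory=list)

measure_bp = Task("Measure blood pressure")
start_drug = Task("Start antihypertensive", condition="systolic >= 140")
measure_bp.next_tasks.append(start_drug)

# A decision-support engine would walk this task graph at the point of care.
for task in [measure_bp] + measure_bp.next_tasks:
    print(task.name, "|", task.condition or "no condition")
```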

    A matter of words: NLP for quality evaluation of Wikipedia medical articles

    Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains like the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. In particular, we evaluate the articles adopting an "actionable" model, whose features are related to the content of the articles, so that the model can also directly suggest strategies for improving the quality of a given article. We rely on Natural Language Processing (NLP) and dictionary-based techniques in order to extract the bio-medical concepts in a text. We prove the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which had previously been manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain appreciable improvements with respect to existing solutions, mainly for those articles that other approaches classified less accurately. Besides being interesting in their own right, the results call for further research in the area of domain-specific features suitable for Web data quality assessment.
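
    A feature in the spirit of Domain Informativeness can be sketched as the share of an article's tokens that appear in a biomedical vocabulary. The dictionary below is a toy placeholder; the paper relies on NLP pipelines and full bio-medical dictionaries rather than this hand-rolled lookup.

```python
# Sketch of a "Domain Informativeness"-style feature: the share of tokens in an
# article that appear in a biomedical dictionary. The dictionary is a toy stand-in.
BIOMEDICAL_TERMS = {"hypertension", "systolic", "diastolic", "etiology", "prognosis"}

def domain_informativeness(text):
    """Fraction of tokens found in the biomedical dictionary."""
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    if not tokens:
        return 0.0
    in_domain = sum(1 for t in tokens if t in BIOMEDICAL_TERMS)
    return in_domain / len(tokens)

article = "Hypertension is diagnosed from repeated systolic and diastolic readings."
print(round(domain_informativeness(article), 2))
```

    Such a score would then be combined with the infobox and document-type features in a standard classifier.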

    Natural language analysis of online health forums

    Despite advances in concept extraction from free text, finding meaningful health-related information in online patient forums still poses a significant challenge. Here we demonstrate how structured information can be extracted from posts found in such online health-related forums by forming relationships between a drug/treatment and a symptom or side effect, including the polarity/sentiment of the patient. In particular, a rule-based natural language processing (NLP) system is deployed, where information in sentences is linked together through anaphora resolution. Our NLP relationship extraction system provides a strong baseline, achieving an F1 score of over 80% in discovering such relationships in the posts we analysed.
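
    The flavour of rule-based drug-effect extraction can be shown with a heavily simplified stand-in. The patterns and lexicons below are invented; the paper's rule set is far richer and also resolves anaphora across sentences, which this toy version ignores.

```python
# Toy rule-based extraction of (drug, effect, polarity) tuples from forum posts.
# Lexicons and cue phrases are simplified stand-ins for the paper's rule set.
import re

DRUGS = {"metformin", "ibuprofen"}
EFFECTS = {"nausea", "headache", "pain"}
NEGATIVE_CUES = {"caused", "gave me", "worsened"}
POSITIVE_CUES = {"relieved", "helped with", "stopped"}

def extract_relations(post):
    relations = []
    for sentence in re.split(r"[.!?]", post.lower()):
        drug = next((d for d in DRUGS if d in sentence), None)
        effect = next((e for e in EFFECTS if e in sentence), None)
        if drug and effect:
            polarity = ("positive" if any(c in sentence for c in POSITIVE_CUES)
                        else "negative" if any(c in sentence for c in NEGATIVE_CUES)
                        else "neutral")
            relations.append((drug, effect, polarity))
    return relations

print(extract_relations("Metformin gave me nausea. Ibuprofen relieved the pain quickly."))
```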

    Domain-independent Extraction of Scientific Concepts from Research Articles

    We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task and (b) examine the transferability of concepts between domains. Second, we present two deep learning systems as baselines. In particular, we propose active learning to deal with the different domains in our task. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, and (3) active learning enables us to nearly halve the amount of required training data. Comment: Accepted for publication in the 42nd European Conference on IR Research, ECIR 202
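
    The active learning referred to here follows the usual pool-based, uncertainty-sampling pattern. The sketch below is only an illustration of that general loop: it uses a toy text classifier in place of the paper's sequence-labelling systems, and the documents and labels are invented.

```python
# Minimal sketch of pool-based active learning with least-confident sampling.
# A toy text classifier stands in for the paper's deep sequence labellers.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["graphene conductivity study", "randomised clinical trial of aspirin",
         "deep learning for protein folding", "survey of wireless sensor networks",
         "bayesian inference in cosmology", "gene expression in yeast"]
labels = [0, 1, 1, 0, 0, 1]                 # invented domain labels

X = TfidfVectorizer().fit_transform(texts)
labelled, pool = [0, 1], [2, 3, 4, 5]

for _ in range(2):                           # two active-learning rounds
    clf = LogisticRegression().fit(X[labelled], [labels[i] for i in labelled])
    probs = clf.predict_proba(X[pool])
    uncertainty = 1 - probs.max(axis=1)      # least-confident sampling
    pick = pool[int(np.argmax(uncertainty))]
    labelled.append(pick)                    # simulate annotating the picked example
    pool.remove(pick)

print("queried examples:", labelled[2:])
```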

    Alignment of the UMLS semantic network with BioTop: Methodology and assessment

    Motivation: For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the Semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed to align the SN with BioTop. Methods: The theoretical foundations and the practical realization of the alignment are described, with a focus on the design decisions taken, the problems encountered, and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability. Results: The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up s
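
    The satisfiability check over semantic-type combinations can be pictured with a much reduced stand-in: map each SN type to an upper-level class and flag combinations whose classes are declared disjoint. The mappings and disjointness axioms below are invented; the actual work performs OWL DL reasoning over BioTop rather than this hand-rolled check.

```python
# Reduced illustration of checking semantic-type combinations for satisfiability.
# Mappings and disjointness axioms are invented examples, not BioTop content.
SN_TO_UPPER = {
    "Disease or Syndrome": "Condition",
    "Pharmacologic Substance": "MaterialObject",
    "Organism Function": "Process",
}
DISJOINT = {frozenset({"MaterialObject", "Process"}),
            frozenset({"Condition", "MaterialObject"})}

def satisfiable(type_a, type_b):
    """A combination is unsatisfiable if its upper-level classes are disjoint."""
    pair = frozenset({SN_TO_UPPER[type_a], SN_TO_UPPER[type_b]})
    return pair not in DISJOINT

print(satisfiable("Disease or Syndrome", "Organism Function"))      # True
print(satisfiable("Pharmacologic Substance", "Organism Function"))  # False
```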

    A unified framework for managing provenance information in translational research

    Background: A critical aspect of the NIH Translational Research roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to the patient's "bedside," is the management of the provenance metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate the scientific process, and associate trust values with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle and do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists. Results: We identify a common set of challenges in managing provenance information across the pre-publication and post-publication phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata: (a) provenance collection, during data generation; (b) provenance representation, to support interoperability and reasoning and to incorporate domain semantics; (c) provenance storage and propagation, to allow efficient storage and seamless propagation of provenance as the data is transferred across applications; and (d) provenance query, to support queries of increasing complexity over large data volumes and to support knowledge discovery applications. We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for Trypanosoma cruzi (T.cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness. Conclusions: The SPF provides a unified framework to effectively manage the provenance of translational research data during the pre- and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.
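
    The collection and query stages listed above amount to recording origin metadata alongside each data item and making it queryable. The miniature sketch below uses invented record fields and an in-memory list; the SPF itself is built on the Provenir ontology and RDF-based storage.

```python
# Miniature illustration of provenance collection and querying.
# Field names are invented for the example; the SPF uses the Provenir ontology
# and RDF-based storage rather than this in-memory list.
from datetime import datetime, timezone

provenance_log = []

def record(entity, activity, agent):
    """Collection stage: capture origin metadata when data is generated."""
    provenance_log.append({
        "entity": entity, "activity": activity, "agent": agent,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

def derived_by(entity):
    """Query stage: trace how a data item was produced and by whom."""
    return [p for p in provenance_log if p["entity"] == entity]

record("expression_matrix_v2", "normalization", "lab_pipeline_1.3")
record("expression_matrix_v2", "batch_correction", "lab_pipeline_1.3")
print(derived_by("expression_matrix_v2"))
```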
